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Reconstructing the density fluctuations in the 
early Universe that evolved into the distribu- 
tion of galaxies wo-, see today is a challenge of 
modern cosmologyJil An accurate reconstruction 
would allow us to test cosmological models by 
simulating the evolution starting from the recon- 
structed state and comparing it to the obser- 
vations. Several .reconstruction techniques have 
been proposed,ETu but they all suffer from lack 
of uniqueness because the velocities of galaxies 
are usually not known. Here we show that re- 
construction can be reduced to a well-determined 
problem of optimisation, and present a specific al- 
gorithm that provides excellent agreement when 
tested against data from N-body simulations. By 
applying our algorithm to the new redshift sur- 
veys now under way,E3 we will be able to recover 
reliably the properties of the primeval fluctuation 
field of the local Universe and to determine ac- 
curately the peculiar velocities (deviations from 
the Hubble expansion) and the true positions of 
many more galaxies than is feasible by any other 
method. 

Starting from the available data on the galaxy distri- 
bution, can we trace back in time and map to its initial 
locations the highly structured distribution of mass in the 
Universe (Fig. |lj)? Here we show that, with a suitable hy- 
pothesis, the knowledge of both the present non-uniform 
distribution of mass and of its primordial quasi-uniform 
distribution uniquely determines the inverse Lagrangian 
map, defined as the transformation from present (Eule- 
rian) positions x to the respective initial (Lagrangian) 
positions q. 

We first consider the direct Lagrangian map q x, 
which can be approximately written in terms of a po- 
tential as x = Vq$(p), at those scales where nonlin- 
earity stays moderate.til This is supported by numeri- 
cal N-body simulations showing good agreement withj-a 
very simple potential approximation, due to Zel'dovich,E3 
which assumes that the particles move on straight tra- 
jectories. Even better agreement is obtained with a 




FIG. 1. iV-body simulation output (present epoch) used for 
testing our reconstruction method. In the standard model of 
structure formation, the distribution of matter in the Universe 
is believed to have emerged from a very smooth initial state: 
tiny irregularities of the gravitational potential, which we can 
still observe as temperature fluctuations of the cosmic mi- 
crowave background, gave rise to density fluctuations, which 
grew under their self-gravity, developing a rich and coherent 
pattern of structures. Most of the mass is in the form of cold 
dark matter; the luminous matter (galaxies) can be assumed 
to trace - up to some form of bias - the underlying dark mat- 
ter. Shown here is a projection onto the x-y plane of a 10% 
slice of the simulation box of size 200/i _1 Mpc. The model, 
ACDM, uses cold dark matter with cosmological constant and 
the following parameters: Hubble constant h = 0.65, density 
parameters Q\ = 0.7 and ilm = 0.3, normalization factor 
as = 0.99. Points are highlighted in yellow when reconstruc- 
tion fails by more than 6 ft -1 Mpc, which happens mostly in 
high-density regions. 

In our "reconstruction hypothesis" , we furthermore as- 
sume the convexity of the potential $(q), a consequence 
of which is the absence of multi-streaming: for almost 
any Eulerian position, there is a single Lagrangian an- 
tecedent. As is well-known, the Zel'dovich approxima- 
tion leads to caustics and to multi-streaming. This can 
be overcome in various ways, for example by a modifi- 
cation known as the adhesion modj&L an equation of vis- 
cous pressureless gas dynamicsJlZrlla The latter, which 
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leads to shocks ij&ther than caustics, is known to have a 
convex potentiate and to be in better agreement with 
N-body simulations. Suppression or reduction of multi- 
streaming requires a mechanism of momentum exchange, 
such as viscosity, between neighbouring streams having 
different velocities. This is a common phenomenon in 
ordinary fluids, such as the flow of air or water in our 
natural environment. Dark matter is however essentially 
collisionless and the usual mechanism for generating vis- 
cosity (discovered by Maxwell) does not operate, so that 
a non-collisional mechanism involving a small-scale grav- 
itational instability must be invoked. 

Our reconstruction hypothesis implies that the initial 
positions can be obtained from the present ones by an- 
other gradient map: q = V x 6(x), where is a convex 
potential related to $ by a Legendre-Fenchel transform 
(see Methods). We denote by po the initial mass den- 
sity (which can be treated as uniform) and by p(x) the 
final one. Mass conservation implies pod 3 q = p{~x)d 3 x. 
Thus, the ratio of final to initial density is the Jacobian 
of the inverse Lagrangian map. This .-can be written as 
the following Monge-Ampere equationE3 for the unknown 
potential Q 

det (V B4 V BJ 0(x)) =p(x)/po, (1) 

where 'det' stands for determinant. 

We emphasize that no information about the dynam- 
ics of matter other than the reconstruction hypothe- 
sis is needed for our method, whose degree of success 
depends crucially on how well this hypothesis is satis- 
fied. Exact reconstruction is obtained, for example, for 
the Zel'dovich approximation (before particle trajecto- 
ries cross) and for adhesion-model dynamics (at arbitrary 
times). 

We note that our Monge-Ampere equation for self- 
gravitating matter may be viewed as a nonlinear gen- 
eralisation of a Poisson equation (used for reconstruction 
in ref. |J) , to which it reduces if particles have moved very 
little from their initial positions. 

It has been discovered recently that the map generated 
by the solution to the Monge-Ampere equation^R]) is the 
(unique) solution to an optimisation problemtll (see also 
refs p2| and |23|) . This is the 'pass transportation' problem 
of Monge and KantorovichE5t3 in which one seeks the 
map x^q that minimises the quadratic 'cost' function 

I = f po|x- q| 2 d 3 q = f p(x)|x - q| 2 d 3 x. (2) 

J q J x 

Note that x — q is forbidden: as the initial and final den- 
sity fields po and p(x) are prescribed there is a constraint 
on Jacobian of the map (see Methods). 

Next, we take into account that information on the 
mass distribution is provided in the form of N discrete 
particles both in simulations and when handling observa- 
tional data from galaxy surveys. The cost minimisation 
then becomes what is known in optimisation theory as 
the assignment problem: find the unique one-to-one pair- 
ing of a set of TV initial points q/s and N final points Xj's 



that minimises Idiscr = J2i=i l x i — cb(«)| 2 - An immediate 
consequence is that, for any subset of k pairs of initial 
and final points (2 < k < N), the contribution of these 
points to the cost function should not decrease under ar- 
bitrary permutations of initial points. This property is 
known to be equivalent to having a Lagrangian map that 
is the gradient of a convex function.c3 

If we restrict ourselves to interchanging just pairs 
(k = 2), the map is said to be monotonic, a condition 
not equivalent to minimisation of the cost function (ex- 
cept in one dimension) . In ref. ^J, a method of reconstruc- 
tion called the Path Interchange Zel'dovich Approxima- 
tion (PIZA) is introduced which uses the same quadratic 
cost function (obtained by applying a minimum-action 
argument within the framework of the Zel'dovich approx- 
imation). In PIZA, a randomly chosen tentative corre- 
spondence between initial and final points is successively 
improved by swapping pairs of initial particles whenever 
this decreases the cost function. Eventually, a mono- 
tonic map is obtained which usually does not minimise 
the cost. This explains the non-uniqueness of PIZA re- 
construction (also noticed in ref. pf). 
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FIG. 2. Tests of MAK reconstructions of the Lagrangian 
positions, using the data shown in Fig. |l| The dots near 
the diagonal are a scatter plot of reconstructed initial points 
vs. simulation initial points for the coarsest 6.25 h~ x Mpc grid 
with 17,178 points. The scatter diagram uses a 'quasi-periodic 
projection' coordinate q = (gi + 52 v2 + qs y/S) / ( 1 + V2 + , 
which guarantees a one-to-one correspondence between 
(/-values and points on the regular Lagrangian grid. The up- 
per left inset is a histogram (by percentage) of distances in re- 
construction mesh units between such points; the first slightly 
darker bin (whose width was taken to be slightly less than one 
mesh) corresponds to perfect reconstruction (thereby allowing 
a good determination of the peculiar velocities of galaxies); 
the lower right inset is a similar histogram for reconstruction 
on a finer 3.12 ft -1 Mpc grid using 19,187 points. With the 
6.25/i -1 Mpc grid, 62% of the 17,178 points are assigned per- 
fectly and about 75% are within not more than one mesh. 
With the 3.12/i -1 Mpc grid, we have 34% of exact reconstruc- 
tion out of 19,187 points. On further refinement of the mesh 
by a factor two, this degrades to 14%. 
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There are, however, known deterministic strategies for 
the assignment problem which give the correct unique so- 
lution; their complexity (dependence on N of the num- 
ber of operations needed) is close to ./V 3 for arbitrary 
cost functions, but can be sharply reduced when the 
cost function is quadratic. Combining the organisation 
of data taken from Henon's mechanical analogue ma- 
chinecJ for solving the assignpient problem with the dual 
simplex method of Balinski,E3 we have designed an al- 
gorithm which gives the optimal assignement for about 
20,000 particles in a few hours of CPU on a fast Al- 
pha machine. For historical reasons we call our ap- 
proach Monge-Ampere-Kantorovich or MAK (see Meth- 
ods). Details of the algorithms will be given elsewhere; 
we merely note that, when working with catalogues of 
several hundred thousands of galaxies expected within 
a few years, a direct application of the assignment al- 
gorithm in its present state could require unreasonable 
computational resources. A mixed strategy can however 
be used, in which the assignment problem is solved on a 
coarse grid while, on smaller scales, the Monge- Ampere 
equation (hf) is solved by a relaxation technique (adapted 
from ref. E3t). 

We tested the MAK reconstruction on data obtained 
by a cosmological A-body simulation with 128 3 parti- 
cles, using the HYDRA codeS (Fig. @). Reconstruction 
was performed on three 32 3 grids with (comoving) meshes 
given by Ax — 6.25 hr 1 Mpc, Ax/2 and Ax/4, where h is 
the Hubble constant in units of 100 km s _1 Mpc -1 . In co- 
moving coordinates, the typical displacement of our mass 
elements over one Hubble time is about ten ft, -1 Mpc. 
We discarded those points that, at the end of the sim- 
ulation (present epoch), were not within a sphere con- 
taining about 20,000 points, a number comparable to 
that of currently available all-sky galaxy redshift cata- 
logues. As the simulation assumes periodic boundary 
conditions, we also took into account periodicity when 
calculating the distance between pairs of points. The 
MAK reconstructions were used to generate a scatter di- 
agram and various histograms allowing comparisons of 
simulation and reconstructed Lagrangian points (Fig. ||). 
The results demonstrate the essentially potential charac- 
ter of the Lagrangian map above ~ 6/i~ 1 Mpc (within 
the given cosmological model) . 

We also performed PIZA reconstructions on the coars- 
est grid and obtained typically 30-40% exactly recon- 
structed points, but severe non- uniqueness: for two dif- 
ferent seeds of the random generator only about half of 
the exactly reconstructed points were the same. 

When reconstructing from observational data, in red- 
shift space (Fig. ||), the galaxies appear displaced radially 
(as seen by the observer) by an amount proportional to 
the radial component of the peculiar velocity. We thus 
performed another reconstruction, with an accordingly 
modified cost function, that led to somewhat degraded 
results (Fig. ||) but at the same time provided an ap- 
proximate determination of peculiar velocities. More ac- 
curate determination of peculiar velocities can be done 
using second-order Lagrangian perturbation theory. The 



effect of the catalogue selection function can be handled 
by standard techniques; for instance one can assign each 
galaxy a 'mass' inversely proportional to the catalog se- 
lection functionEm 

What is the smallest length scale at which an opti- 
misation algorithm such as MAK can be expected to 
give a unique and reliable reconstruction? The key in- 
gredient here is the simultaneous knowledge of the ini- 
tial and present mass density fields. MAK-type recon- 
struction (with a suitable cost based on the equation 
of a self-gravitating fluid) should therefore be possible 
down to scales comparable to the thickness of collapsed 
structures, below which the hydrodynamical description 
ceases to be meaningful. 




FIG. 3. Reconstruction test in redshift space with the same 
data as for the real-space reconstruction tested in the upper 
left histogram of Fig. |^. The circular redshift map (violet 
points) corresponds to the same real-space slice as displayed 
in Fig. |l| The observer is taken to be at the center of the 
simulation box. Points used for reconstruction within the 
displayed slice are highlighted in red. Reconstruction is per- 
formed by the MAK algorithm with a different cost function, 
obtained (as in ref. |^) by assuming that the peculiar veloc- 
ities v can be estimated by the Zel'dovich approximation: 
v — /(x — q), where / ~ fi^ 6 « 0.49. Note that we now have 
43% of exactly reconstructed points, included among the 60% 
which are within not more than 6.25 h" 1 Mpc from their cor- 
rect positions. 

The fact that MAK guarantees a unique solution and 
that our present reconstruction hypothesis proved to be 
very faithful down to 6.5 h~ l Mpc makes our method vera, 
promising for the analysis of galaxy redshift surveys .113 
Reconstruction of the primordial positions and velocities 
of matter will allow us to test the Gaussian nature of the 
primordial perturbations and the self-consistency of cos- 
mological hypotheses, such as the choice of the global cos- 
mological parameters and the assumed biasing scheme. 
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By obtaining a point-by-point reconstruction of the spe- 
cific realisation that describes the observed patch of our 
Universe, we can distinguish between universal proper- 
ties and the influence of the large-scale environment on 
the galaxy formation process. Moreover, reconstruction 
will open a new window not only onto the past but also 
into the present Universe: it would enable us to make 
a first-time determination of the peculiar velocities of a 
very large number of galaxies, using their positions in 
redshift catalogues. 

Methods 

Monge— Ampere equation 

The Lagrangian map q t— > x is taken to be the gradient of a convex 
potential $(q); therefore its inverse x i— » q also has a potential 
representation q = V x ©(x), where 0(x) is again a convex function; 
the two potentials are Legendre-Fenchel transforms of each other 
(see ref. BCj, pp. 61-65): 

©(x) = max [q ■ x — 5>(q)] ; §(q) = max [x • q — 0(x)] . (3) 

q X 

The potential satisfies the Mpnge— Ampere equation (|l]), written 
for the first time by Ampere; 20 ! by exploiting the property of the 
Legendre transformation. Note that within the more restricted 
framework of the Zeldovich approximation, © differs just rby a 
quadratic additive term from the Eulerian velocity potential. E3 

Quadratic cost function 

To show that the quadratic cost minimisation leads to the Monge- 
Ampere equation, we define the displacement field £(x) = x — q(x) 
and perform a variation <5£(x) to obtain, to lowest order, the 
variation of the cost function 81 = J 2£(x) • (p(x) <5£) d?x. 
The condition that the Eulerian density remains unchanged, which 
constrains the variation, is expressed as V x ■ (p( x )<5£( x )) = 0. By a 
simple Lagrange multiplier argument, this implies that £ must be a 
gradient of some function of x; thus, q = x— £ = V x ©(x). Further- 
more, should © be non-convex and thus lead to multi-streaming, 
this would prevent the Lagrangian map from being optimal. 

History of mass transportation 

Monger 4 ! posed the following problem: how to optimally move ma- 
terial from one place to another, knowing only its initial and final 
spatial distributions, the cost being a prescribed function of the 
distance travelled by 'molecules' of p*atcrial (a linear function in 
Monge's original work). KantorovichP 5 ! showed that Monge's query 
was an instance of the linear programming problem and developed 
for it a theory which found numerous applications in economics and 
applied mathematics. 
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